2016 Texas Election Contributions by Ana Javed:

The 2016 U.S. Presidential Election brought several unexpected results that reshaped the vision many Americans had for the country. Analysts closely monitored states to gauge whether they would swing in favor of Hillary R. Clinton (D) or Donald Trump (R). Texas was unusual in this election because, despite voting consistently Republican since 1980, many Texan voters did not default to the Republican Party candidate. Donald Trump ultimately won Texas, but by a slimmer margin than the Republican candidate had in the 2012 election. By analyzing campaign contributions, I aim to better understand how donations may have reflected the political atmosphere of Texas at the time of the election.

Univariate Plots Section

The dataset was obtained from the Federal Election Commission site. It has 19 variables and almost 540,000 observations. Below are the dimensions and summary of the data.

## [1] 539546     19
##       cmte_id            cand_id                            cand_nm      
##  C00575795:203851   P00003392:203851   Clinton, Hillary Rodham  :203851  
##  C00574624:138805   P60006111:138805   Cruz, Rafael Edward 'Ted':138805  
##  C00577130: 79954   P60007168: 79954   Sanders, Bernard         : 79954  
##  C00580100: 69288   P80001571: 69288   Trump, Donald J.         : 69288  
##  C00573519: 23692   P60005915: 23692   Carson, Benjamin S.      : 23692  
##  C00458844:  8983   P60006723:  8983   Rubio, Marco             :  8983  
##  (Other)  : 14973   (Other)  : 14973   (Other)                  : 14973  
##                 contbr_nm           contbr_city     contbr_st  
##  RUDOLPH, BONNIE     :   463   HOUSTON    : 69946   TX:539546  
##  SCOTT, MELVIN       :   304   AUSTIN     : 55523              
##  SCHORLEMER, DAVID S.:   257   DALLAS     : 40143              
##  PHAN, JULIE         :   245   SAN ANTONIO: 28860              
##  GILLIS, DEBORAH DEE :   218   FORT WORTH : 14134              
##  WEST, LOIS          :   216   PLANO      :  8605              
##  (Other)             :537843   (Other)    :322335              
##      contbr_zip          contbr_employer               contbr_occupation 
##  77024    :   773   RETIRED      :102975   RETIRED              :140646  
##  77379    :   517   N/A          : 55714   NOT EMPLOYED         : 23085  
##  75225    :   504   SELF-EMPLOYED: 35768   INFORMATION REQUESTED: 16537  
##  78633    :   471   SELF EMPLOYED: 19031   ATTORNEY             : 14050  
##  780451915:   455   NONE         : 18012   HOMEMAKER            : 11278  
##  75093    :   438   (Other)      :307561   (Other)              :333813  
##  (Other)  :536388   NA's         :   485   NA's                 :   137  
##  contb_receipt_amt   contb_receipt_dt 
##  Min.   :-16600.0   12-JUL-16:  5816  
##  1st Qu.:    20.0   11-JUL-16:  5494  
##  Median :    38.0   29-FEB-16:  5476  
##  Mean   :   139.1   05-APR-16:  4681  
##  3rd Qu.:   100.0   31-MAR-16:  4650  
##  Max.   : 15000.0   02-MAY-16:  4279  
##                     (Other)  :509150  
##                            receipt_desc    memo_cd   
##                                  :522170    :437983  
##  Refund                          :  3631   X:101563  
##  REDESIGNATION TO GENERAL        :  2927             
##  REDESIGNATION FROM PRIMARY      :  2922             
##  REDESIGNATION TO CRUZ FOR SENATE:  1764             
##  REATTRIBUTION TO SPOUSE         :  1311             
##  (Other)                         :  4821             
##                                memo_text       form_tp      
##                                     :410387   SA17A:447296  
##  * EARMARKED CONTRIBUTION: SEE BELOW: 78429   SA18 : 88619  
##  * HILLARY VICTORY FUND             : 33940   SB28A:  3631  
##  REDESIGNATION TO GENERAL           :  2927                 
##  REDESIGNATION FROM PRIMARY         :  2922                 
##  REDESIGNATION TO CRUZ FOR SENATE   :  1764                 
##  (Other)                            :  9177                 
##     file_num               tran_id       election_tp       X          
##  Min.   :1003942   SA17.1135539:     4        :  2252   Mode:logical  
##  1st Qu.:1077404   SB28A.1269  :     4   G2016:162833   NA's:539546   
##  Median :1091720   C10688898   :     3   P2012:     2                 
##  Mean   :1091567   C10859784   :     3   P2016:374459                 
##  3rd Qu.:1112134   C10868587   :     3                                
##  Max.   :1134173   C10937433   :     3                                
##                    (Other)     :539526

Breakdown by Candidates:

The bar graph displays the number of donations each candidate received. Only candidates with more than 1,000 donations are shown on the plot. Hillary Clinton received the most donations across Texas, with over 200,000. Ted Cruz came second, followed by Bernie Sanders and Donald Trump. Ted Cruz was a U.S. Senator from Texas at the time of the election, which can explain his popularity and the number of donations he received. I am curious to see the totals for all the candidates' donations.

##           Kasich, John R.             Johnson, Gary 
##                      1187                      1336 
##            Fiorina, Carly                Paul, Rand 
##                      2535                      3035 
##                 Bush, Jeb              Rubio, Marco 
##                      3578                      8983 
##       Carson, Benjamin S.          Trump, Donald J. 
##                     23692                     69288 
##          Sanders, Bernard Cruz, Rafael Edward 'Ted' 
##                     79954                    138805 
##   Clinton, Hillary Rodham 
##                    203851

Breakdown by Cities:

The cities with the most donations were Houston, Austin, Dallas, San Antonio, Fort Worth, and Plano. Houston is the most populous city in Texas, and San Antonio, Dallas, and Austin are the next three largest, which could help explain the high number of donations from these cities. I am interested in finding how each city leans politically.

##         IRVING       AMARILLO  THE WOODLANDS     GEORGETOWN        MIDLAND 
##           3562           3682           3772           3775           3879 
##         FRISCO CORPUS CHRISTI     SUGAR LAND        LUBBOCK           KATY 
##           4085           4504           4990           5163           5926 
##      ARLINGTON        EL PASO         SPRING          PLANO     FORT WORTH 
##           7101           7564           8017           8605          14134 
##    SAN ANTONIO         DALLAS         AUSTIN        HOUSTON        (Other) 
##          28860          40143          55523          69946         117729

Breakdown by Donation Levels:

After creating a new variable “Donation_Level”, I grouped the donation amounts into 5 levels: “$200 and Under”, “$200.1-499”, “$500-999”, “$1000-1999”, and “$2000 and over”. The overwhelming majority of donations were $200 or less, as the first plot shows. I then wanted to see which donation levels each candidate received most.

In summary, Hillary Clinton received over 175,000 donations of $200 or less. For donations between $500-999, Donald Trump received over 10,500 donations. As the donation value increased, both Hillary Clinton and Ted Cruz continued to compete for the most donations.

## $200 and Under    $200.1 -499       $500-999     $1000-1999 $2000 and over 
##         471235          33724          13193           9287          12107 
##           TRUE 
##              0
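The binning described above can be sketched in base R with `cut()`. This is a minimal illustration, not the original code: the sample amounts are made up, and the exact bin edges (for example, whether $499.50 belongs to the second or third level) are an assumption, since the report only gives the level labels.

```r
# Hypothetical sample of contribution amounts (the real column is contb_receipt_amt).
amounts <- c(-50, 20, 200, 200.01, 499, 500, 999, 1000, 1999, 2000, 15000)

# Assumed bin edges: each interval is (lower, upper], so 200 falls in the first
# level and 500 in the third. Amounts are in dollars and cents.
breaks <- c(-Inf, 200, 499.99, 999.99, 1999.99, Inf)
labels <- c("$200 and Under", "$200.1-499", "$500-999",
            "$1000-1999", "$2000 and over")

donation_level <- cut(amounts, breaks = breaks, labels = labels, right = TRUE)

# Count donations per level, analogous to the table printed above.
level_counts <- table(donation_level)
print(level_counts)
```

Note that with these edges, negative amounts (refunds) land in the lowest level, so it may be worth filtering them out first if the levels are meant to describe donations only.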

Donations Under $200: It is apparent that the top 5 candidates receiving donations valued $200 or less are Hillary Clinton (D), Ted Cruz (R), Bernie Sanders (D), Donald Trump (R), and Ben Carson (R).

Donations between $200.1-499: At this higher donation level, Donald Trump (R) received the most, followed by Hillary Clinton (D), Ted Cruz (R), Ben Carson (R), and Bernie Sanders (D). It is interesting to note that even though Texas is generally considered a “red” or Republican state, Hillary Clinton still received a similar number of donations to her Republican counterparts, if not more.

Donations between $500-999: With higher donation amounts, Donald Trump (R), Ted Cruz (R), and Hillary Clinton (D) still have the highest donation counts. Other candidates such as Bernie Sanders (D) and Ben Carson (R) received far fewer donations.

Donations between $1000-1999: It is important to notice that donations to Donald Trump (R) have decreased, with Hillary Clinton (D) and Ted Cruz (R) receiving the most donations in this category.

Donations over $2000: Ted Cruz (R) and Hillary Clinton (D) remain close in donations in this category as well, with donations to Donald Trump (R) decreasing.

Breakdown by Contributors:

After grouping the data by individual contributor, the plot displays how many donations each donor made. After creating the first plot, I applied a log10 scale to the y axis and set limits on the x axis. There is one noticeable outlier at over 60,000 donations.
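As a rough illustration of the grouping step, the sketch below counts donations per contributor in base R and then log10-transforms the resulting frequencies. The names are hypothetical, and grouping by name alone is an assumption; the original analysis may have used additional contributor fields and plotted with a different package.

```r
# Hypothetical contributor names, one entry per donation record.
contbr_nm <- c("DONOR A", "DONOR A", "DONOR A", "DONOR B",
               "DONOR C", "DONOR C", "DONOR D", "DONOR E")

# Number of donations made by each contributor.
donations_per_contributor <- table(contbr_nm)

# How many contributors made 1 donation, 2 donations, and so on.
count_distribution <- table(as.vector(donations_per_contributor))

# A log10 transform compresses the large gap between the common case
# (few donations) and the rare case (many donations).
log_counts <- log10(as.vector(count_distribution))
print(count_distribution)
```

On a heavily right-skewed distribution like this one, the log scale keeps the rare high-count contributors visible next to the very tall bar of one-time donors.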

Breakdown by Contributor Occupations:

A contributor can be influenced politically by their lifestyle and occupation, and a person’s occupation also influences how much they can donate. Here is a breakdown of which occupations donated the most and the least. Counts for the top 10 occupations are listed as well. It is apparent that retired contributors made the most donations.

##                                RETIRED 
##                                 140646 
##                           NOT EMPLOYED 
##                                  23085 
##                  INFORMATION REQUESTED 
##                                  16537 
##                               ATTORNEY 
##                                  14050 
##                              HOMEMAKER 
##                                  11278 
##                               ENGINEER 
##                                   8933 
##                              PHYSICIAN 
##                                   8707 
##                                TEACHER 
##                                   7968 
##                                  SALES 
##                                   6831 
## INFORMATION REQUESTED PER BEST EFFORTS 
##                                   6263

Univariate Analysis

What is the structure of your dataset?

The dataset, TX, contains 19 variables with nearly 540,000 observations.

What is/are the main feature(s) of interest in your dataset? What other features in the dataset do you think will help support your investigation into your feature(s) of interest?

I am very interested in examining if living in a particular city influences who a contributor donates to. I also want to further explore the backgrounds of the contributors (such as amount donated, their occupation, and the number of donations), which will shed more light on the voters and donors in Texas.

Did you create any new variables from existing variables in the dataset?

Yes, as mentioned above, I created the donation_level variable to group donation values in certain bins.

Of the features you investigated, were there any unusual distributions? Did you perform any operations on the data to tidy, adjust, or change the form of the data? If so, why did you do this?

I applied log10 to the y axis of the last plot, which examined the number of donations each contributor made. The original graph was extremely skewed to the right and I wanted to have more insight on the plotted data. I also adjusted the x-axis using coord_cartesian().

Bivariate Plots Section

Examining Candidates Donation Amounts & Means

After exploring candidate donations by certain buckets, I revised the bar graph to better visualize the number of donations received by the top 4 candidates.

I also included a box plot of the candidates that summarizes their contribution amounts. I applied log10 to the y axis to make the ranges of donation amounts easier to compare. One finding is that Donald Trump had the highest median donation value. This was expected, since Donald Trump received higher-valued donations in greater numbers than lower-valued ones.

The plots below explore mean donation and number of donations for each candidate. The correlation between the two variables was -0.384, a weak negative relationship that falls just short of significance at the 5% level (p = 0.058). As the number of donations increased, mean_donation tended to decrease.

## 
##  Pearson's product-moment correlation
## 
## data:  tx_candidates$number_of_donations and tx_candidates$mean_donation
## t = -1.9938, df = 23, p-value = 0.05816
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  -0.67641730  0.01325091
## sample estimates:
##      cor 
## -0.38389
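A per-candidate summary like the one tested above could be built along the following lines. This is a sketch on made-up data using base R; the column names `cand_nm` and `contb_receipt_amt` come from the dataset summary, while the construction of `tx_candidates` is an assumption about how the original table was produced.

```r
# Hypothetical donation records; the real data has one row per contribution.
tx <- data.frame(
  cand_nm = c("A", "A", "A", "A", "B", "B", "B", "C", "C", "D"),
  contb_receipt_amt = c(10, 20, 30, 40, 50, 100, 150, 500, 700, 2000)
)

# One row per candidate: how many donations, and their mean value.
number_of_donations <- tapply(tx$contb_receipt_amt, tx$cand_nm, length)
mean_donation <- tapply(tx$contb_receipt_amt, tx$cand_nm, mean)

tx_candidates <- data.frame(
  cand_nm = names(number_of_donations),
  number_of_donations = as.vector(number_of_donations),
  mean_donation = as.vector(mean_donation)
)

# Pearson correlation test between the two per-candidate summaries.
result <- cor.test(tx_candidates$number_of_donations,
                   tx_candidates$mean_donation)
print(result$estimate)
```

Because each candidate contributes a single point, the test has few degrees of freedom (23 in the output above), which is one reason the confidence interval there is so wide.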

Examining Cities’ Donation Amounts & Means

Instead of grouping by candidate as I did in the previous plot, I wanted to explore whether the same negative trend occurred when grouping by city. In this plot, the calculated correlation was much closer to zero at -0.0089, and it was not statistically significant (p = 0.68).

## 
##  Pearson's product-moment correlation
## 
## data:  tx_cities$number_of_donations and tx_cities$mean_donation
## t = -0.41871, df = 2220, p-value = 0.6755
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  -0.05045083  0.03270921
## sample estimates:
##          cor 
## -0.008886173
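The t statistics and p-values reported for both tests follow directly from the correlation and its degrees of freedom, via t = r * sqrt(df) / sqrt(1 - r^2). The short check below recomputes them from the printed values; the helper function names are introduced here for illustration only.

```r
# t statistic for a Pearson correlation r with df = n - 2 degrees of freedom.
t_from_r <- function(r, df) r * sqrt(df) / sqrt(1 - r^2)

# Two-sided p-value from a t statistic.
p_two_sided <- function(t, df) 2 * pt(-abs(t), df)

# Candidates: r and df as printed in the first test output.
t_cand <- t_from_r(-0.38389, 23)
p_cand <- p_two_sided(t_cand, 23)

# Cities: r and df as printed in the second test output.
t_city <- t_from_r(-0.008886173, 2220)
p_city <- p_two_sided(t_city, 2220)

print(c(t_cand = t_cand, p_cand = p_cand, t_city = t_city, p_city = p_city))
```

The comparison shows why the two results read so differently: the candidate correlation is larger in magnitude but rests on only 25 points, while the city correlation rests on over 2,000 points and is still indistinguishable from zero.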

Cities’ Support for Presidential Candidates

It was determined earlier that Hillary Clinton received the largest share of the donations, with Donald Trump coming fourth. Since these two candidates were opponents in the Presidential election, I wanted to further explore the breakdown by Texan cities.

Mean donations for Hillary Clinton and Donald Trump:

## [1] 91.74212
## [1] 31.18272

Contributions based on Occupation

I was eager to visualize the relationship between total_employees and contributions_total, because occupations differ in salary, which affects how much an individual can contribute. For example, a teacher would likely have less to contribute than a CEO. To my surprise, the plot showed that as total_employees increased, so did contributions_total (with a correlation of 0.9295151).

## 
##  Pearson's product-moment correlation
## 
## data:  tx_occupations$total_employees and tx_occupations$contributions_total
## t = 330.18, df = 17161, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  0.9274517 0.9315219
## sample estimates:
##       cor 
## 0.9295151

Bivariate Analysis

Talk about some of the relationships you observed in this part of the investigation. How did the feature(s) of interest vary with other features in the dataset? Did you observe any interesting relationships between the other features (not the main feature(s) of interest)? What was the strongest relationship you found?

I wanted to measure how number of donations and mean donation were related when grouping by candidate or by a contributor’s city. Across candidates, there was a weak negative relationship between the two variables (r = -0.38, p = 0.058): as the number of donations increased, the mean donation decreased. A possible explanation is that many low-value donations outnumber the occasional high-value donation. This was also seen earlier in the analysis, when it was determined that the majority of donations were $200 or less. When measuring the same variables across cities, a different pattern emerged. The correlation was close to 0, and the majority of the mean donation values were $250 or less, regardless of the number of donations made.

Another relationship I examined was cities’ support for particular candidates. It was determined earlier that Houston, Austin, and Dallas had the highest number of donations, but I wanted to explore which candidate they favored. To my surprise, in each of those cities, contributors donated substantially more to Hillary Clinton than to Donald Trump, despite Donald Trump winning the popular vote in Texas. Lastly, I created a scatter plot of number of employees per occupation vs. the total contributions made. Log10 x and y scales were applied to better visualize the data, and the plot showed that as total employees increased, so did the contribution amount. The Pearson correlation for this trend was 0.93, the strongest relationship I found.

Multivariate Plots Section

Occupation, Number of Donations, & Total Number of Employees

I took the last bivariate plot and added the number of donations variable as color to examine whether it could explain the high correlation between contribution totals and total employees. I did not apply the alpha parameter, so that the color ranges on the plot would be easier to see. It is apparent that number of donations did not have much impact on the variables, since there is so little variation in color.

Number of Donations and Mean Donation Per City

For each city, I wanted to see the mean_donation and its relationship with the number of donations. Houston had the most donations but a lower mean than Dallas.

Multivariate Analysis

Talk about some of the relationships you observed in this part of the investigation. Were there features that strengthened each other in terms of looking at your feature(s) of interest?

For the first graph, the variables investigated were total_employees, contributions_total, and number_of_donations. In the last section, total_employees and contributions_total were determined to have a strong relationship, and I wanted to see if number_of_donations had any influence. Upon further examination, there was very little variation in color (which encodes number_of_donations), which meant the variable had little influence.

Were there any interesting or surprising interactions between features?

The second graph analyzed number_of_donations, contbr_city, and mean_donation. I was surprised to see so much variation in how number_of_donations related to mean_donation. For example, Houston had the most donations and also a higher mean than Austin (which came second in number of donations). I would be eager to compare this graph with mean salaries for individual cities in Texas.


Final Plots and Summary

Plot One

Description One

This bar graph examines how many donations each candidate received at the different donation levels. The donation level most given by contributors was $200 and Under, and Hillary Clinton and Ted Cruz were the top 2 candidates receiving these donations. Ted Cruz and Hillary Clinton continue to be popular recipients at the other levels; however, Donald Trump received more donations valued $200.1-499 than the other candidates, even though donations at that level were far less common. In none of the bars does Bernie Sanders receive the most donations, yet he received more donations than Donald Trump in total.

Plot Two

Description Two

The plot examines the relationship between the number of employees and the total contributions made in a given occupation. As the number of employees in a profession increased, the total donation amount also increased. It’s important to note that all occupations (of varying salaries) were included when making the plot, which could mean that salary did not strongly influence the total donation amount. More data on contributors’ salaries would be required to confirm this finding.

Plot Three

Description Three

The final variable explored further was Texan cities. Total donations and mean donations were examined for each city. Bearing in mind that each city has its own population size and culture, the results showed that cities contributed different numbers of donations and had varying means. Midland, for example, had under 5,000 donations, but its mean donation amount was around $300. Houston, on the other hand, had over 60,000 donations, but its mean donation was around $200.


Reflection

When conducting my analysis, I initially ran into some difficulties dealing with mostly qualitative data. The cities, candidates, contributor names, and contributor occupations are some of the variables I had extensive data on. I found a lot of success when I created the new “donation_level” variable to group the various donation amounts. It was easier to conduct analysis across the other variables with a cleaned variable. The data could be enriched with additional information about the contributors, such as salary and usual political affiliation. I would not be surprised if some usually Republican voters voted for Hillary Clinton in this particular election.

Sources:

http://www.270towin.com/states/Texas
http://fec.gov/disclosurep/pnational.do